Oceanithermus profundus Miroshnichenko et al. 2003 is the type species of the genus
Oceanithermus, which belongs to the family Thermaceae. The genus currently comprises two
species whose members are thermophilic and are able to reduce sulfur compounds and nitrite.
The organism is adapted to the salinity of sea water and is able to utilize a broad range of
carbohydrates, some proteinaceous substrates, organic acids and alcohols. This is the first
completed genome sequence of a member of the genus Oceanithermus and the fourth sequence
from the family Thermaceae. The 2,439,291 bp long genome with its 2,391 protein-coding
and 54 RNA genes consists of one chromosome and a 135,351 bp long plasmid, and is a part
of the Genomic Encyclopedia of Bacteria and Archaea project.
Introduction



Strain 506T (DSM 14977 = NBRC 100410 = VKM B-2274) is the type strain of Oceanithermus
profundus, which is the type species of the genus Oceanithermus [1] of the family
Thermaceae [2]. Together with O. desulfurans, there are currently two species placed in
the genus [1,3]. The generic name derives from the Latin noun oceanus, meaning ocean, and
the Neo-Latin masc. substantive (from Gr. adj. thermos) thermus, which means hot.
Therefore, the name Oceanithermus refers to warmth-loving organisms living in the ocean.
The species epithet is derived from the Latin adjective profundus, meaning deep,
pertaining to the abyss or to the depths of the ocean [1]. Strain 506T was first isolated
from samples of hydrothermal fluids and chimneys collected at the 13°N hydrothermal vent
field on the East Pacific Rise at a depth of 2600 m [1]. No further cultivated strains of
this species are known. The other member of the genus, O. desulfurans, is a thermophilic,
sulfur-reducing bacterium isolated from a sulfide chimney in Suiyo Seamount, in the
Western Pacific [3]. Here we present a summary classification and a set of features for
O. profundus 506T, together with the description of the complete genomic sequencing and
annotation.

Classification and features 

A representative genomic 16S rRNA sequence of strain 506T was compared using NCBI BLAST
under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from
the best 250 hits) with the most recent release of the Greengenes database [4], and the
relative frequencies, weighted by BLAST scores, of taxa and keywords (reduced to their
stem) [5] were determined. The five most frequent genera were Thermus (52.0%),
Meiothermus (37.0%), Oceanithermus (7.6%), Marinithermus (2.0%) and Vulcanithermus (1.4%)
(156 hits in total). Regarding the four hits to sequences from members of the species,
the average identity within HSPs was 99.6%, whereas the average coverage by HSPs was
94.8%. Regarding the two hits to sequences from other members of the genus, the average
identity within HSPs was 99.3%, whereas the average coverage by HSPs was 91.0%. Among all
other species, the one yielding the highest score was O. desulfurans, which corresponded
to an identity of 99.3% and an HSP coverage of 91.0%. The highest-scoring environmental
sequence was EU555123 ('Microbial Sulfide Hydrothermal Vent Field Juan de Fuca Ridge
Dudley hydrothermal vent clone 4132B16'), which showed an identity of 99.1% and an HSP
coverage of 98.0%. The five most frequent keywords within the labels of environmental
samples which yielded hits were 'spring' (8.2%), 'hot' (6.2%), 'microbi' (4.5%),
'geochem, nation, park, yellowston' (2.8%) and 'hydrotherm/vent' (2.5%) (94 hits in
total). The five most frequent keywords within the labels of environmental samples which
yielded hits of a higher score than the highest scoring species were 'hydrotherm/vent'
(12.2%), 'field, microbi, ridg' (6.1%), 'fluid' (5.9%), 'dudlei, fuca, juan, sulfid'
(3.1%) and 'degre, east, north, ocean, pacif, rise' (3.0%) (3 hits in total). These 16S
rRNA BLAST results are consistent with the kind of environment from which the strain was
isolated and therefore fit the description of the isolate.
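
The score-weighted relative frequencies reported above can be illustrated with a minimal Python sketch. It assumes each BLAST hit has already been reduced to a (genus, score) pair; the function name and the example hits are hypothetical and are not the data or the software used in this study.

```python
from collections import defaultdict


def weighted_genus_frequencies(hits):
    """Return {genus: percent}, with each hit weighted by its BLAST score."""
    totals = defaultdict(float)
    for genus, score in hits:
        totals[genus] += score
    grand_total = sum(totals.values())
    return {genus: 100.0 * s / grand_total for genus, s in totals.items()}


# Hypothetical (genus, score) pairs; NOT the hits obtained for strain 506T.
example_hits = [
    ("Thermus", 1500.0),
    ("Thermus", 1500.0),
    ("Meiothermus", 1000.0),
    ("Oceanithermus", 1000.0),
]

frequencies = weighted_genus_frequencies(example_hits)
for genus, pct in sorted(frequencies.items(), key=lambda kv: -kv[1]):
    print(f"{genus}: {pct:.1f}%")
```

Weighting by score rather than counting hits means that a few strong matches can outweigh many weak ones, which is why the percentages need not mirror the raw number of hits per genus.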

Figure 1 shows the phylogenetic neighborhood of O. profundus in a 16S rRNA based tree.
The sequences of the two identical 16S rRNA gene copies in the genome differ by one
nucleotide from the previously published 16S rRNA sequence (AJ430586).

The cells of O. profundus are described as non-motile, rod-shaped, 0.5 - 0.7 µm in
diameter and of various lengths (Figure 2). When grown on proteinaceous substrates, old
cultures of O. profundus form filaments and large spheres resembling the 'rotund bodies'
typical of aged cells of Thermus species [1,15]. The organism is Gram-negative and
non-spore-forming (Table 1).

O. profundus is microaerophilic, only being able to grow at oxygen concentrations below
6% [1]. No growth has been observed in an atmosphere of air, either in liquid medium or
on plates. In an agar tube containing 5 ml of basal medium supplemented with 2 g sucrose
and 1 g tryptone per liter with air in the headspace (10 ml), growth occurs in a zone
located 20 mm below the agar/air interface [1]. Alternatively, the organism grows
anaerobically using nitrate as the electron acceptor. O. profundus grows within a
temperature range of 40-68°C, optimal growth being observed at 60°C. At 60°C, it grows
between pH 5.5 and 8.4, with an optimum around 7.5 [1]. Strain 506T grows at NaCl
concentrations ranging from 10 to 50 g/l, with an optimum at 30 g/l [1]. The organism is
oxidase- and catalase-positive and is able to utilize a wide spectrum of carbohydrates in
the presence of either nitrate or oxygen [1]. The highest cell yield is observed in the
presence of nitrate with fructose, maltose, sucrose, trehalose, galactose, rhamnose or
xylose. Glucose, lactose and starch are utilized, but no growth has been reported with
ribose, galactose, arabinose, dextrin or cellobiose [1]. Acetate and propionate are
produced during growth with sucrose as a growth substrate and nitrate as the electron
acceptor. Nitrite is the only product of denitrification [1]. O. profundus grows well
with complex proteinaceous substrates such as beef extract, tryptone or papaic digest of
soybean (1-1.5 g/l). However, growth is strongly inhibited by higher concentrations of
these substrates [1]. The isolate does not grow with Casamino acids or yeast extract as
sole sources of carbon and energy, though 100 mg/l yeast extract is required for growth
[1]. O. profundus is able to utilize acetate, pyruvate and propionate as growth
substrates. It also grows with methanol, ethanol and mannitol, though the cell yield is
lower [1]. O. profundus is able to grow lithoheterotrophically using molecular hydrogen
as the energy source, yeast extract as the carbon source and nitrate as the electron
acceptor. Other electron acceptors (sulfate, elemental sulfur, thiosulfate and nitrite)
do not support growth, regardless of growth substrate [1]. Detailed studies on the
metabolism of maltose, acetate, pyruvate, and hydrogen have been undertaken by Fedosov
et al. [26].

[Figure 1. Tree image not reproduced; only the taxon labels survive in the extracted
text. Taxa shown (16S rRNA accession): Thermus igniterrae (Y18406), Thermus brockianus
(Z15062), Thermus antranikianii (Y18411), Thermus scotoductus (AF032127), Thermus
filiformis (X58345), Thermus aquaticus (L09663), Thermus islandicus (EU753247), Thermus
arciformis (EU247889), Thermus thermophilus (X07998), Thermus oshimai (Y18416),
Marinithermus hydrothermalis (AB079382)*, Oceanithermus profundus (IMG2503535871)*,
Oceanithermus desulfurans (AB107956), Vulcanithermus mediatlanticus (AJ507298)*,
Meiothermus taiwanensis (AF418001), Meiothermus cateniformans (EU247891), Meiothermus
ruber (Z15059)**, Meiothermus cerbereus (Y13594), Meiothermus rufus (FN178496),
Meiothermus granaticius (GU584097), Meiothermus timidus (AJ871168), Meiothermus
chliarophilus (X84212), Meiothermus silvanus (X84211)**, Deinococcus radiodurans
(Y11332), Truepera radiovictrix (DQ022076). Scale bar: 0.04.]


Figure 1. Phylogenetic tree highlighting the position of O. profundus relative to the other type strains
within the family Thermaceae. The tree was inferred from 1,420 aligned characters [6,7] of the 16S rRNA
gene sequence under the maximum likelihood criterion [8]. Rooting was initially done using the midpoint
method [9] and then checked for its accordance with the current taxonomy (see Table 1) and
rooted accordingly. The branches are scaled in terms of the expected number of substitutions per site.
Numbers to the right of bifurcations are support values from 1,000 bootstrap replicates [10] if larger than
60%. Lineages with type strain genome sequencing projects that are registered in GOLD [11] but remain
unpublished are labeled with one asterisk, published genomes with two asterisks [12-14].

Table 1. Classification and general features of O. profundus 506T according to the MIGS recommendations [16].

MIGS ID     Property                 Term                                                     Evidence code
            Current classification   Domain Bacteria                                          TAS [17]
                                     Phylum "Deinococcus-Thermus"                             TAS [18,19]
                                     Class Deinococci                                         TAS [20,21]
                                     Order Thermales                                          TAS [21,22]
                                     Family Thermaceae                                        TAS [21,23]
                                     Genus Oceanithermus                                      TAS [1]
                                     Species Oceanithermus profundus                          TAS [1]
                                     Type strain 506T                                         TAS [1]
            Gram stain               negative                                                 TAS [1]
            Cell shape               rod-shaped                                               TAS [1]
            Motility                 non-motile                                               TAS [1]
            Sporulation              none                                                     TAS [1]
            Temperature range        40-68°C                                                  TAS [1]
            Optimum temperature      60°C                                                     TAS [1]
            Salinity                 1%-5%, optimum 3% NaCl                                   TAS [1]
MIGS-22     Oxygen requirement       microaerophile                                           TAS [1]
            Carbon source            carbohydrates                                            TAS [1]
            Energy metabolism        chemoorganoheterotroph, lithoheterotroph, organotroph    TAS [1]
MIGS-6      Habitat                  deep sea, hydrothermal vent, marine                      TAS [1]
MIGS-15     Biotic relationship      free-living                                              TAS [1]
MIGS-14     Pathogenicity            none                                                     NAS
            Biosafety level          1                                                        NAS [24]
            Isolation                deep-sea hot vent                                        TAS [1]
MIGS-4      Geographic location      East Pacific Rise                                        TAS [1]
MIGS-5      Sample collection time   1999                                                     TAS [1]
MIGS-4.1    Latitude                 12.8                                                     TAS [1]
MIGS-4.2    Longitude                103.93                                                   TAS [1]
MIGS-4.3    Depth                    2,600 m                                                  TAS [1]
MIGS-4.4    Altitude                 -2,600 m                                                 NAS


Evidence codes - IDA: Inferred from Direct Assay (first time in publication); TAS: Traceable Author Statement (i.e., a
direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living,
isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence
codes are from the Gene Ontology project [25]. If the evidence code is IDA, then the property was directly observed
by one of the authors or an expert mentioned in the acknowledgements.



Chemotaxonomy 

The polar lipid pattern of strain 506T comprises three phospholipids, whereas glycolipids
have not been detected [1]. This differentiates the organism from members of the genera
Vulcanithermus, Rhabdothermus, Thermus and Meiothermus, where phospholipids and
glycolipids have both been detected [27,28]. It should be noted that the major
phospholipid detected in O. profundus has the same Rf and staining behavior as the
2'-O-(1,2-diacyl-sn-glycero-3-phospho)-3'-O-(α-N-acetyl-glucosaminyl)-N-glyceroyl
alkylamine reported to occur in members of the genera Meiothermus and Thermus [29]. On
the basis of Rf value and staining behavior this lipid also appears to be present in
members of the genera Vulcanithermus and Rhabdothermus, which also synthesize glycolipids
[30,31]. Although members of the genus Deinococcus may also produce glycolipids in
addition to a novel series of phosphoglycolipids [32,33], the latter are absent in
members of the genera Thermus and Meiothermus. The absence of glycolipids was one of the
arguments of Miroshnichenko et al. for placing strain 506T in a new genus [1].

Menaquinones are the sole respiratory lipoquinones detected, with MK-8 predominating
(95%) and MK-9 being present in smaller proportions (5%) [1]. The predominance of MK-8 is
consistent with reports of MK-8 in members of the genera Thermus, Meiothermus [34,35],
Marinithermus [36], Vulcanithermus, Rhabdothermus, Truepera, Deinobacterium and
Deinococcus [30-33,37]. However, the presence of MK-9, albeit at only 5%, appears to be a
unique feature of O. profundus.

The fatty acids comprise mainly iso- and anteiso-branched fatty acids, though
iso-unsaturated fatty acids are also present [1]. The major fatty acids are iso-C15:1ω7
(7.7%), iso-C15:0 (33.2%), iso-C16:1ω8 (2.6%), iso-C16:0 (3.3%), iso-C17:1ω7c (18.8%),
iso-C17:0 (12.3%), anteiso-C15:0 (5.1%) and anteiso-C17:0 (5.4%) [1]. The presence of
iso- and anteiso-branched fatty acids is a feature of members of the genera Deinococcus,
Thermus, Meiothermus, Vulcanithermus, Rhabdothermus and Marinithermus [27,28,30-34,37].
The presence of unsaturated branched-chain fatty acids is a distinctive feature of
members of the genera Oceanithermus, Vulcanithermus and Rhabdothermus within the family
Thermaceae. The unsaturated fatty acid content of the isolate is also higher (33-37%)
than that of the closest relative O. desulfurans (18%) [3].

Genome sequencing and annotation 

Genome project history 

This organism was selected for sequencing on the basis of its phylogenetic position [38]
and is part of the Genomic Encyclopedia of Bacteria and Archaea project [39]. The genome
project is deposited in the Genome On Line Database [11] and the complete genome sequence
is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE
Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.



Growth conditions and DNA isolation 

O. profundus strain 506T, DSM 14977, was grown anaerobically in DSMZ medium 975
(Oceanithermus profundus medium) [40] at 60°C. DNA was isolated from 0.5-1 g of cell
paste using the Jetflex Genomic DNA Purification Kit following the standard protocol as
recommended by the manufacturer, but with an additional proteinase K (20 µl) digestion
for 45 min at 58°C. DNA is available through the DNA Bank Network [41].

Genome sequencing and assembly 

The genome was sequenced using a combination of Illumina and 454 sequencing platforms.
All general aspects of library construction and sequencing can be found at the JGI
website [42]. Pyrosequencing reads were assembled using the Newbler assembler version
2.3-PreRelease-8-23-2009 (Roche). The initial Newbler assembly, consisting of nine
contigs in four scaffolds, was converted into a phrap assembly [43] by making fake reads
from the consensus, to collect the read pairs in the 454 paired end library. Illumina
GAii sequencing data (208 Mb) was assembled with Velvet [44] and the consensus sequences
were shredded into 1.5 kb overlapped fake reads and assembled together with the 454 data.
The 454 draft assembly was based on 306.1 Mb 454 draft data and all of the 454 paired end
data. Newbler parameters are -consed -a 50 -l 350 -g -m -ml 20. The Phred/Phrap/Consed
software package [43] was used for sequence assembly and quality assessment in the
subsequent finishing process. After the shotgun stage, reads were assembled with parallel
phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with
gapResolution [42], Dupfinisher, or by sequencing cloned bridging PCR fragments with
subcloning or transposon bombing (Epicentre Biotechnologies, Madison, WI) [45]. Gaps
between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks
(J.-F. Chang, unpublished). A total of 177 additional reactions were necessary to close
gaps and to raise the quality of the finished sequence. Illumina reads were also used to
correct potential base errors and increase consensus quality using the software Polisher
developed at JGI [46]. The error rate of the completed genome sequence is less than 1 in
100,000. Together, the combination of the Illumina and 454 sequencing platforms provided
282.8 × coverage of the genome. The final assembly contained 1,258,374 pyrosequence and
5,792,221 Illumina reads.
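
The shredding of a consensus sequence into overlapping fake reads, as described above, can be sketched as follows. This is only an illustration: the function name, the 500 bp overlap and the toy contig are assumptions, since the text states the 1.5 kb read length but not the overlap or the tool used.

```python
def shred(consensus, read_length=1500, overlap=500):
    """Cut a consensus string into fixed-length pieces that overlap.

    read_length follows the 1.5 kb stated in the text; the overlap value
    is an assumption made for this illustration only.
    """
    if not 0 <= overlap < read_length:
        raise ValueError("overlap must be >= 0 and smaller than read_length")
    if not consensus:
        return []
    step = read_length - overlap
    reads = []
    start = 0
    while True:
        reads.append(consensus[start:start + read_length])
        if start + read_length >= len(consensus):
            break  # the last piece reaches the end of the consensus
        start += step
    return reads


# Toy 4,000 bp contig (hypothetical sequence, not genome data).
contig = "ACGT" * 1000
fake_reads = shred(contig)
print(len(fake_reads), [len(r) for r in fake_reads])
```

Because consecutive pieces share their overlap, an overlap-based assembler such as phrap can join them back together alongside the 454 reads.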

Table 2. Genome sequencing project information

MIGS ID      Property                     Term
MIGS-31      Finishing quality            Finished
MIGS-28      Libraries used               Three genomic libraries: one 454 pyrosequence standard library,
                                          one 454 PE library (17 kb insert size), one Illumina library
MIGS-29      Sequencing platforms         Illumina GAii, 454 GS FLX Titanium
MIGS-31.2    Sequencing coverage          85.5 × Illumina; 197.3 × pyrosequence
MIGS-30      Assemblers                   Newbler version 2.3-PreRelease-8-23-2009, Velvet, phrap
MIGS-32      Gene calling method          Prodigal 1.4, GenePRIMP
             INSDC ID                     CP002361 chromosome
                                          CP002362 plasmid OCEPR01
             Genbank Date of Release      December 7, 2010
             GOLD ID                      Gc01553
             NCBI project ID              40223
             Database: IMG-GEBA           2503508010
MIGS-13      Source material identifier   DSM 14977
             Project relevance            Tree of Life, GEBA




Genome annotation 

Genes were identified using Prodigal [47] as part of the Oak Ridge National Laboratory
genome annotation pipeline, followed by a round of manual curation using the JGI
GenePRIMP pipeline [48]. The predicted CDSs were translated and used to search the
National Center for Biotechnology Information (NCBI) nonredundant database, UniProt,
TIGR-Fam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction
analysis and functional annotation were performed within the Integrated Microbial
Genomes - Expert Review (IMG-ER) platform [49].



Genome properties 

The genome consists of a 2,303,940 bp long chromosome with a G+C content of 70% and a
135,351 bp plasmid with a G+C content of 66% (Table 3 and Figure 3). Of the 2,445 genes
predicted, 2,391 were protein-coding genes and 54 were RNA genes; 18 pseudogenes were
also identified. The majority of the protein-coding genes (69.9%) were assigned a
putative function while the remaining ones were annotated as hypothetical proteins. The
distribution of genes into COGs functional categories is presented in Table 4.
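
As a quick consistency check on the figures in this paragraph, the short Python sketch below recomputes the total genome size and the gene-class percentages from the stated counts. The variable and function names are illustrative, and the combined G+C value is only an estimate because the per-replicon G+C percentages are given as rounded numbers.

```python
# Values taken from the text above.
chromosome_bp = 2_303_940
plasmid_bp = 135_351
total_genes = 2_445
protein_coding = 2_391
rna_genes = 54
pseudogenes = 18


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


genome_bp = chromosome_bp + plasmid_bp

# Length-weighted G+C estimate from the rounded 70% and 66% values;
# an approximation only, since the inputs are rounded.
gc_estimate = round(
    100.0 * (chromosome_bp * 0.70 + plasmid_bp * 0.66) / genome_bp, 1
)

print(genome_bp)                             # 2439291
print(protein_coding + rna_genes == total_genes)
print(percent(protein_coding, total_genes))  # 97.79
print(percent(rna_genes, total_genes))       # 2.21
print(percent(pseudogenes, total_genes))     # 0.74
print(gc_estimate)
```

The recomputed genome size matches the 2,439,291 bp reported for the complete genome, and the percentages use the total gene count as the denominator.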



Table 3. Genome Statistics

Attribute                           Value        % of Total
Genome size (bp)                    2,439,291    100.00%
DNA coding region (bp)              2,265,747    92.89%
DNA G+C content (bp)                1,702,985    69.81%
Number of replicons                 2
Extrachromosomal elements           1
Total genes                         2,445        100.00%
RNA genes                           54           2.21%
rRNA operons                        2
Protein-coding genes                2,391        97.79%
Pseudo genes                        18           0.74%
Genes with function prediction      1,709        69.90%
Genes in paralog clusters           25           1.02%
Genes assigned to COGs              1,772        72.47%
Genes assigned Pfam domains         1,842        75.34%
Genes with signal peptides          615          25.15%
Genes with transmembrane helices    654          26.75%
CRISPR repeats                      0

Figure 3. Graphical circular map of the chromosome (map of plasmid not shown). From outside to the center: Genes
on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes
(tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.

Table 4. Number of genes associated with the general COG functional categories

Code   Value   %age   Description
J      150     7.7    Translation, ribosomal structure and biogenesis
A      1       0.0    RNA processing and modification
K      90      4.6    Transcription
L      91      4.7    Replication, recombination and repair
B      1       0.0    Chromatin structure and dynamics
D      27      1.4    Cell cycle control, cell division, chromosome partitioning
Y      0       0.0    Nuclear structure
V      31      1.6    Defense mechanisms
T      80      4.1    Signal transduction mechanisms
M      79      4.1    Cell wall/membrane/envelope biogenesis
N      23      1.2    Cell motility
Z      0       0.0    Cytoskeleton
W      0       0.0    Extracellular structures
U      47      2.4    Intracellular trafficking, secretion, and vesicular transport
O      82      4.2    Posttranslational modification, protein turnover, chaperones
C      154     7.9    Energy production and conversion
G      125     6.4    Carbohydrate transport and metabolism
E      203     10.4   Amino acid transport and metabolism
F      72      3.7    Nucleotide transport and metabolism
H      93      4.8    Coenzyme transport and metabolism
I      66      3.4    Lipid transport and metabolism
P      100     5.1    Inorganic ion transport and metabolism
Q      31      1.6    Secondary metabolites biosynthesis, transport and catabolism
R      244     12.5   General function prediction only
S      155     8.0    Function unknown
-      673     27.6   Not in COGs